Embedding Comparator: Visualizing Differences in Global Structure and Local Neighborhoods via Small Multiples
Embeddings mapping high-dimensional discrete input to lower-dimensional
continuous vector spaces have been widely adopted in machine learning
applications as a way to capture domain semantics. Interviewing 13 embedding
users across disciplines, we find comparing embeddings is a key task for
deployment or downstream analysis but unfolds in a tedious fashion that poorly
supports systematic exploration. In response, we present the Embedding
Comparator, an interactive system that presents a global comparison of
embedding spaces alongside fine-grained inspection of local neighborhoods. It
systematically surfaces points of comparison by computing the similarity of the
k-nearest neighbors of every embedded object between a pair of spaces.
Through case studies, we demonstrate our system rapidly reveals insights, such
as semantic changes following fine-tuning, language changes over time, and
differences between seemingly similar models. In evaluations with 15
participants, we find our system accelerates comparisons by shifting from
laborious manual specification to browsing and manipulating visualizations.
Comment: Equal contribution by first two authors.
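The neighborhood-similarity computation described in this abstract could be sketched roughly as follows. This is an illustrative assumption, not the Embedding Comparator's actual implementation: the function name, the use of Euclidean distance, the Jaccard-style overlap score, and k=2 in the example are all choices made here for the sketch.

```python
import numpy as np

def knn_overlap_similarity(emb_a, emb_b, k=10):
    """For every embedded object, compare its k-nearest-neighbor set in
    two embedding spaces; the score is the Jaccard overlap of the two
    neighbor sets (1.0 = identical neighborhoods, 0.0 = disjoint)."""
    def knn_indices(emb):
        # Pairwise squared Euclidean distances between all rows.
        d = np.sum((emb[:, None, :] - emb[None, :, :]) ** 2, axis=-1)
        np.fill_diagonal(d, np.inf)  # a point is not its own neighbor
        return np.argsort(d, axis=1)[:, :k]

    nn_a, nn_b = knn_indices(emb_a), knn_indices(emb_b)
    sims = []
    for a, b in zip(nn_a, nn_b):
        sa, sb = set(a.tolist()), set(b.tolist())
        sims.append(len(sa & sb) / len(sa | sb))
    # Low scores flag objects whose local neighborhoods changed most
    # between the two spaces -- candidate points of comparison.
    return np.array(sims)
```

Ranking objects by this score is one way such a system could surface the most-changed neighborhoods first, rather than requiring users to specify comparisons manually.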
VisText: A Benchmark for Semantically Rich Chart Captioning
Captions that describe or explain charts help improve recall and
comprehension of the depicted data and provide a more accessible medium for
people with visual disabilities. However, current approaches for automatically
generating such captions struggle to articulate the perceptual or cognitive
features that are the hallmark of charts (e.g., complex trends and patterns).
In response, we introduce VisText: a dataset of 12,441 pairs of charts and
captions that describe the charts' construction, report key statistics, and
identify perceptual and cognitive phenomena. In VisText, a chart is available
as three representations: a rasterized image, a backing data table, and a scene
graph -- a hierarchical representation of a chart's visual elements akin to a
web page's Document Object Model (DOM). To evaluate the impact of VisText, we
fine-tune state-of-the-art language models on our chart captioning task and
apply prefix-tuning to produce captions that vary the semantic content they
convey. Our models generate coherent, semantically rich captions and perform on
par with state-of-the-art chart captioning models across machine translation
and text generation metrics. Through qualitative analysis, we identify six
broad categories of errors that our models make, which can inform future work.
Comment: Published at ACL 2023; 29 pages, 10 figures.
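The scene-graph representation described above (a DOM-like hierarchy of a chart's visual elements) could be modeled minimally as follows. The class name, field names, and example roles are illustrative assumptions, not VisText's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class SceneNode:
    """One node in a chart scene graph: a visual element with a role
    (e.g., 'chart', 'axis', 'mark'), attributes, and child elements,
    analogous to an element in a web page's DOM."""
    role: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def iter_nodes(self):
        """Depth-first traversal over the hierarchy."""
        yield self
        for child in self.children:
            yield from child.iter_nodes()

# A tiny bar chart expressed as a scene graph.
chart = SceneNode("chart", {"title": "Sales by year"}, [
    SceneNode("x-axis", {"field": "year"}),
    SceneNode("y-axis", {"field": "sales"}),
    SceneNode("marks", children=[
        SceneNode("bar", {"year": 2020, "sales": 10}),
        SceneNode("bar", {"year": 2021, "sales": 15}),
    ]),
])
```

Unlike a rasterized image, such a structure keeps both the underlying data values and the visual roles explicit, which is what makes it usable as model input alongside the image and the data table.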
Assessing the Impact of Automated Suggestions on Decision Making: Domain Experts Mediate Model Errors but Take Less Initiative
Automated decision support can accelerate tedious tasks as users can focus
their attention where it is needed most. However, a key concern is whether
users overly trust or cede agency to automation. In this paper, we investigate
the effects of introducing automation to annotating clinical texts--a
multi-step, error-prone task of identifying clinical concepts (e.g.,
procedures) in medical notes, and mapping them to labels in a large ontology.
We consider two forms of decision aid: recommending which labels to map
concepts to, and pre-populating annotation suggestions. Through laboratory
studies, we find that 18 clinicians generally build intuition of when to rely
on automation and when to exercise their own judgement. However, when presented
with fully pre-populated suggestions, these expert users exhibit less agency:
accepting improper mentions, and taking less initiative in creating additional
annotations. Our findings inform how systems and algorithms should be designed
to mitigate the observed issues.
Striking a Balance: Reader Takeaways and Preferences when Integrating Text and Charts
While visualizations are an effective way to represent insights about
information, they rarely stand alone. When designing a visualization, text is
often added to provide additional context and guidance for the reader. However,
there is little experimental evidence to guide designers as to what is the
right amount of text to show within a chart, what its qualitative properties
should be, and where it should be placed. Prior work also shows variation in
personal preferences for charts versus textual representations. In this paper,
we explore several research questions about the relative value of textual
components of visualizations. 302 participants ranked univariate line charts
containing varying amounts of text, ranging from no text (except for the axes)
to a written paragraph with no visuals. Participants also described what
information they could take away from line charts containing text with varying
semantic content. We find that heavily annotated charts were not penalized. In
fact, participants preferred the charts with the largest number of textual
annotations over charts with fewer annotations or text alone. We also find
effects of semantic content. For instance, the text that describes statistical
or relational components of a chart leads to more takeaways referring to
statistics or relational comparisons than text describing elemental or encoded
components. Finally, we find different effects for the semantic levels based on
the placement of the text on the chart; some kinds of information are best
placed in the title, while others should be placed closer to the data. We
compile these results into four chart design guidelines and discuss future
implications for the combination of text and charts.Comment: 11 pages, 4 tables, 6 figures, accepted to IEEE Transaction on
Visualization and Graphic
Bluefish: A Relational Framework for Graphic Representations
Complex graphic representations -- such as annotated visualizations,
molecular structure diagrams, or Euclidean geometry -- convey information
through overlapping perceptual relations. To author such representations, users
are forced to use rigid, purpose-built tools with limited flexibility and
expressiveness. User interface (UI) frameworks provide only limited relief as
their tree-based models are a poor fit for expressing overlaps. We present
Bluefish, a diagramming framework that extends UI architectures to support
overlapping perceptual relations. Bluefish graphics are instantiated as
relational scenegraphs: hierarchical data structures augmented with adjacency
relations. Authors specify these relations with scoped references to components
found elsewhere in the scenegraph. For layout, Bluefish lazily materializes
necessary coordinate transformations. We demonstrate that Bluefish enables
authoring graphic representations across a diverse range of domains while
preserving the compositional and abstractional affordances of traditional UI
frameworks. Moreover, we show how relational scenegraphs capture previously
latent semantics that can later be retargeted (e.g., for screen reader
accessibility).
Comment: 27 pages, 14 figures.
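The core idea of a relational scenegraph (a tree augmented with adjacency relations whose references cross subtree boundaries) could be sketched as below. The class and function names are hypothetical stand-ins, not Bluefish's actual API.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A relational-scenegraph node: an ordinary tree node augmented with
    'relations' -- (kind, target_name) pairs that reference components
    elsewhere in the tree by name, which a strict tree model cannot
    express directly."""
    name: str
    children: list = field(default_factory=list)
    relations: list = field(default_factory=list)

def find(root, name):
    """Resolve a scoped reference by searching the hierarchy."""
    if root.name == name:
        return root
    for child in root.children:
        hit = find(child, name)
        if hit is not None:
            return hit
    return None

# An annotation that points at a mark living in a *different* subtree:
# the overlap is captured as a relation, not by restructuring the tree.
scene = Node("root", children=[
    Node("plot", children=[Node("mark-a"), Node("mark-b")]),
    Node("annotation", relations=[("points-at", "mark-b")]),
])
```

Because the relation is recorded explicitly rather than flattened into pixels, it remains available for later retargeting, for instance to tell a screen reader that the annotation describes that specific mark.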
Intuitively Assessing ML Model Reliability through Example-Based Explanations and Editing Model Inputs
Interpretability methods aim to help users build trust in and understand the
capabilities of machine learning models. However, existing approaches often
rely on abstract, complex visualizations that poorly map to the task at hand or
require non-trivial ML expertise to interpret. Here, we present two visual
analytics modules that facilitate an intuitive assessment of model reliability.
To help users better characterize and reason about a model's uncertainty, we
visualize raw and aggregate information about a given input's nearest
neighbors. Using an interactive editor, users can manipulate this input in
semantically-meaningful ways, determine the effect on the output, and compare
against their prior expectations. We evaluate our interface using an
electrocardiogram beat classification case study. Compared to a baseline
feature importance interface, we find that 14 physicians are better able to
align the model's uncertainty with domain-relevant factors and build intuition
about its capabilities and limitations.
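The edit-and-compare workflow described above (manipulate an input in a semantically meaningful way, then inspect the effect on the model's output) could be approximated as follows. Every name here (prediction_shift, toy_model, scale_peak) is a hypothetical stand-in, and the toy peak-height "model" is not the paper's ECG classifier.

```python
import numpy as np

def prediction_shift(model, x, edit):
    """Compare model output before and after a semantically meaningful
    edit of the input; the total-variation distance between the two
    output distributions summarizes the prediction's sensitivity."""
    before, after = model(x), model(edit(x))
    return before, after, 0.5 * np.abs(after - before).sum()

def toy_model(signal):
    # Toy stand-in: 'abnormal' probability driven by the signal's peak.
    p = 1.0 / (1.0 + np.exp(-(signal.max() - 1.0)))
    return np.array([1.0 - p, p])

def scale_peak(signal):
    return signal * 1.5  # an interpretable edit: amplify the beat

x = np.array([0.1, 0.2, 1.2, 0.3])
before, after, shift = prediction_shift(toy_model, x, scale_peak)
```

Letting users choose the edit themselves is what makes the comparison intuitive: the edit encodes a domain-relevant hypothesis ("a taller beat should look more abnormal"), and the shift either confirms or contradicts it.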